RANDOM SPARSE UNARY PREDICATES 

1 Introduction.

Let $n$ be a positive integer and $0 \le p \le 1$. The random unary predicate
$U_{n,p}$ is a probability space over predicates $U$ on $[n] = \{1, \ldots, n\}$
with the probabilities determined by

$$\Pr[U(x)] = p, \quad 1 \le x \le n$$

and the events $U(x)$ being mutually independent over $1 \le x \le n$. Informally,
we think of flipping a coin for each $x$ to determine if $U(x)$ holds, the coin
coming up "heads" with probability $p$. We shall examine the first order
language $\langle [n], <, U \rangle$ with equality, a unary predicate $U$ and a
binary predicate $<$. Examples of sentences in this language are:

$A$: $\exists x\, U(x)$

$B$: $\exists x\, U(x) \wedge \forall y\, \neg\, y < x$

$C$: $\exists x\, \exists y\, U(x) \wedge U(y) \wedge \forall z\, \neg [x < z \wedge z < y]$

($\ge$, $>$, $\le$ are naturally definable from $<$ and equality.) For any such
sentence $S$ we have the probability

$$\Pr[U_{n,p} \models S]$$

While the use of unary predicates is natural for logicians there are two other
equivalent formulations that will prove useful. We may think of $U$ as a
subset of $[n]$ and speak about $i \in U$ rather than $U(i)$. Second we may
associate with $U$ a sequence of zeroes and ones where the $i$-th term is one if
$U(i)$ and zero if $\neg U(i)$. Thus we may talk of starting at $i$ and going to the
next one. We shall use all three formulations interchangeably.

Ehrenfeucht [??] showed that for any constant $p$ and any sentence $S$ in
this language

$$\lim_{n \to \infty} \Pr[U_{n,p} \models S]$$

exists. In the case of sentences $A$ and $C$ the limiting probability is one. But
sentence $B$ effectively states $1 \in U$, hence its limiting probability is $p$. We
get around these edge effects with a new language, consisting of equality,
a unary predicate $U$, and a ternary predicate $C$. We consider $C$ as a built
in predicate on $[n]$ with $C(x, y, z)$ holding if and only if either $x < y < z$
or $y < z < x$ or $z < x < y$. Thinking of $[n]$ as a cycle, with 1 coming
directly after $n$, $C(x, y, z)$ is the relation that $x$ to $y$ to $z$ goes in a clockwise
direction. For any sentence $S$ in this new language we can again define
$\Pr[U_{n,p} \models S]$, only in this case Ehrenfeucht's results give a Zero-One Law:
for any constant $p$ and sentence $S$

$$\lim_{n \to \infty} \Pr[U_{n,p} \models S] = 0 \text{ or } 1$$

We shall call the first language the linear language and the second language 
the circular language. As a general guide, the circular language will tend to 
Zero-One Laws while the linear language, because of edge effects, will tend 
to limit laws. 
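
As a minimal sketch of the objects introduced so far, the following Python
fragment samples a random unary predicate as a 0/1 list, evaluates sentences
A and B, and implements the clockwise relation of the circular language. The
function names (`sample_predicate`, `sentence_A`, `sentence_B`, `clockwise`)
are illustrative choices and do not come from the paper.

```python
import random

def sample_predicate(n, p, rng):
    """Return U on [n] as a 0/1 list; U[x - 1] == 1 means U(x) holds."""
    return [1 if rng.random() < p else 0 for _ in range(n)]

def sentence_A(U):
    """A: there exists x with U(x)."""
    return any(U)

def sentence_B(U):
    """B: there exists x with U(x) and no y < x, i.e. U(1) holds."""
    return len(U) > 0 and U[0] == 1

def clockwise(x, y, z):
    """The ternary relation C(x, y, z) of the circular language."""
    return x < y < z or y < z < x or z < x < y

if __name__ == "__main__":
    rng = random.Random(0)
    U = sample_predicate(20, 0.3, rng)
    print(U, sentence_A(U), sentence_B(U))
```

Sentence B depends only on the first coin flip, which is the edge effect that
the circular language is designed to remove.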

We shall not restrict ourselves to $p$ constant but rather consider $p = p(n)$
as a function of $n$. We have in mind the "Evolution of Random Graphs"
as first developed by Erdos and Renyi. Here as $p = p(n)$ evolves from zero
to one the unary predicate evolves from holding for no $x$ to holding for all
$x$. Analogously (but without formal definition) we have threshold functions
for various properties. For example, $p(n) = n^{-1}$ is a threshold function for
$A$. When $p(n) \ll n^{-1}$ almost surely $A$ fails while when $p(n) \gg n^{-1}$ almost
surely $A$ holds. In Shelah, Spencer [??] we showed that when $p = n^{-\alpha}$
with $\alpha \in (0, 1)$ irrational then a Zero-One Law held for the random graph
$G(n,p)$ and in Luczak, Spencer [??] we found a near characterization of
those $p = p(n)$ for which the Zero-One Law held. The situation with random
unary predicates turns out to be somewhat simpler. Let us say $p = p(n)$
satisfies the Zero-One Law for circular unary predicates if for every sentence
$S$ in the circular language

$$\lim_{n \to \infty} \Pr[U_{n,p(n)} \models S] = 0 \text{ or } 1$$

Here is our main result. 

Theorem 1. Let $p = p(n)$ be such that $p(n) \in [0, 1]$ for all $n$ and either

$$p(n) \ll n^{-1}$$

or for some positive integer $k$

$$n^{-1/k} \ll p(n) \ll n^{-1/(k+1)}$$

or for all $\epsilon > 0$

$$n^{-\epsilon} \ll p(n) \quad \text{and} \quad n^{-\epsilon} \ll 1 - p(n)$$

or for some positive integer $k$

$$n^{-1/k} \ll 1 - p(n) \ll n^{-1/(k+1)}$$

or

$$1 - p(n) \ll n^{-1}$$

Then $p(n)$ satisfies the Zero-One Law for circular unary predicates. Inversely
if $p(n)$ falls into none of the above categories then it does not satisfy the
Zero-One Law for circular unary predicates.

The inverse part is relatively simple. Let $A_k$ be the sentence that there
exist $k$ consecutive elements $x_1, \ldots, x_k \in U$. ($x, y$ are consecutive if for
no $z$ is $C(x,z,y)$. For $k = 2$ this is example $C$.) Then $\Pr[A_k]$ is (for a
given $n$) a monotone function of $p$. When $p(n) \sim c n^{-1/k}$ and $c$ a positive
constant the probability $\Pr[A_k]$ approaches a limit strictly between zero and
one. (Roughly speaking, $n^{-1/k}$ is a threshold function for $A_k$.) Thus for
$p(n)$ to satisfy the Zero-One Law we must have $p(n) \ll n^{-1/k}$ or $p(n) \gg
n^{-1/k}$. Further (replacing $U$ with $\neg U$), the same holds with $p(n)$ replaced
by $1 - p(n)$. For $p(n)$ to fall between these cracks it must be in one of the
above five categories.
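
The threshold behaviour can be explored numerically. The sketch below computes,
by dynamic programming over the current run length, the exact probability that
a linear 0/1 string of length n with independent entries contains k consecutive
ones. It is only the linear analogue of $A_k$ (wraparound on the cycle is
ignored, an assumption made to keep the code short), and the function name
`prob_run_of_ones` is hypothetical.

```python
def prob_run_of_ones(n, k, p):
    """Probability that n independent p-coins show k consecutive ones somewhere."""
    # state[r] = probability that no run of k ones has occurred yet and the
    # string currently ends in exactly r ones (0 <= r < k).
    state = [1.0] + [0.0] * (k - 1)
    found = 0.0
    for _ in range(n):
        new_state = [0.0] * k
        for r, mass in enumerate(state):
            new_state[0] += mass * (1.0 - p)   # a zero resets the run
            if r + 1 == k:
                found += mass * p              # the run reaches length k
            else:
                new_state[r + 1] += mass * p   # the run grows by one
        state = new_state
    return found

if __name__ == "__main__":
    k = 2
    for n in (100, 1000, 10000):
        p = n ** (-1.0 / k)                    # p = c * n^(-1/k) with c = 1
        print(n, prob_run_of_ones(n, k, p))
```

Running the loop with $p = n^{-1/k}$ lets one observe values that stay away
from both zero and one as n grows, in line with the discussion above.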



Remark. Dolan [??] has shown that $p(n)$ satisfies the Zero-One Law for
linear unary predicates if and only if $p(n) \ll n^{-1}$ or $n^{-1} \ll p(n) \ll n^{-1/2}$
or $1 - p(n) \ll n^{-1}$ or $n^{-1} \ll 1 - p(n) \ll n^{-1/2}$. For $n^{-1/2} \ll p(n) = o(1)$ he
considered the following property:

$$D: \exists x\, U(x) \wedge [U(x+1) \vee U(x+2)] \wedge \neg \exists y\, [U(y) \wedge [U(y+1) \vee U(y+2)] \wedge y < x] \wedge U(x+1)$$

(Addition is not in our language but we write $x + 1$ as shorthand for that
$z$ for which $x < z$ but there is no $w$ with $x < w < z$.) In our zero-one
formulation $D$ basically states that the first time we have 11 comes before
the first time we have 101. This actually has limiting probability .5. This
example illustrates that limiting probability for linear unary predicates can
depend on edge effects and not just edge effects looking at $U$ on a fixed
size set $1, \ldots, k$ or $n, n-1, \ldots, n-k$. We defer our results for linear unary
predicates to section 4.
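
Property D can be checked mechanically on a given 0/1 sequence. The following
sketch scans for the first position x with U(x) and [U(x+1) or U(x+2)] and then
tests U(x+1). The function name `holds_D` is illustrative, and positions past
the end of the sequence are treated as not in U, which is an assumption about
how the shorthand x + 1 behaves at the right edge.

```python
def holds_D(bits):
    """Evaluate property D on a 0/1 list (position x is bits[x - 1])."""
    n = len(bits)

    def in_U(pos):
        # Positions outside 1..n are treated as not belonging to U.
        return 1 <= pos <= n and bits[pos - 1] == 1

    for x in range(1, n + 1):
        # First x that starts either a block 11 or a block 101.
        if in_U(x) and (in_U(x + 1) or in_U(x + 2)):
            return in_U(x + 1)
    return False

if __name__ == "__main__":
    print(holds_D([0, 1, 1, 0, 1]))  # 11 occurs first
    print(holds_D([1, 0, 1, 1]))     # 101 occurs first
```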



When $p(n) \ll n^{-1}$ the Zero-One Law is trivially satisfied since almost
surely there is no $x$ for which $U(x)$. Also, if $p(n)$ satisfies the Zero-One Law
so does $1 - p(n)$. Suppose $p = p(n)$ satisfies $p(n) \gg n^{-\epsilon}$ and $1 - p(n) \gg n^{-\epsilon}$
for all $\epsilon > 0$. We show in Section 3 that for every $t$ there is a sequence
$A_1 \cdots A_R$ with the property that for any sentence $A$ of quantifier depth
$t$ either all models $\langle [u], C, U \rangle$ that contain $A_1 \cdots A_R$ as a subsequence
satisfy $A$ or no such models satisfy $A$. ($\langle [u], C, U \rangle$ contains $A_1 \cdots A_R$ as
a subsequence if for some $1 \le j \le u$ for all $1 \le i \le R$ we have $U(j + i)$
if and only if $A_i = 1$, with $j + i$ defined modulo $u$.) For $p(n)$ in this range
$\langle [n], C, U \rangle$ almost surely contains any such fixed sequence $A_1 \cdots A_R$ as a
subsequence and hence the Zero-One Law is satisfied. This leaves us with
only one case in Theorem 1, and that will be the object of the next section.

2 The Main Case. 

Here we let $k$ be a positive integer and assume

$$n^{-1/k} \ll p(n) \ll n^{-1/(k+1)}$$

Our object is to show that $p(n)$ satisfies the Zero-One Law for circular unary
predicates. We shall let $t$ be a fixed, though arbitrarily large, positive integer.
We shall examine the equivalence class under the $t$-move Ehrenfeucht game
of the circular model. For the most part, however, we shall examine linear
models.

We define (as Ehrenfeucht did) an equivalence class on models $M =
\langle [n], <, U \rangle$, two models $M, M'$ being equivalent if they satisfy the same depth
$t$ sentences or, equivalently, if the $t$-move Ehrenfeucht game on $M, M'$ is won
by the "Duplicator". The addition of models (with $M$ on $[n]$, $M'$ on $[n']$ we
define $M + M'$ on $[n + n']$) yields an addition of equivalence classes. We shall
denote the equivalence classes by $x, y, \ldots$ and the sum by $x + y$. Results
from the beautiful theory of these classes are given in Section 3.

Let us consider a random unary function $U$ defined on all positive
integers $1, 2, \ldots$ and with $\Pr[U(i)] = p$ for all $i$, these events mutually
independent. (In the end only the values of $U(i)$ for $1 \le i \le n$ will "count"
but allowing $U$ to be defined over all positive integers allows for a "fictitious
play" that shall simplify the analysis.) Now for any starting point $i$ examine
$i, i + 1, \ldots$ until reaching the first $j$ (perhaps $i$ itself) for which $U(j)$. Call
$[i,j]$ the 1-interval of $i$. (With probability one there will be such a $j$;
fictitious play allows us to postpone the analysis of those negligible cases when
no $j$ is found before $j > n$.) What are the possible Ehrenfeucht values of
$\langle [i,j], <, U \rangle$? The model must have a series of zeroes (i.e., $\neg U$) followed
by one one (i.e., $U$). There is an $s$ ($s = 3^t$ will do) so that all such models
with at least $s$ zeroes have the same Ehrenfeucht value. We can write these
values as $a_1, \ldots, a_s$ and $b$ ($a_i$ having $i - 1$ zeroes, $b$ having at least $s$ zeroes). Call
this value the 1-value of $i$. The probability of the 1-value being any particular
$a_i$ is $\sim p$ while the probability of it being $b$ is $\sim 1$. (All asymptotics
are as $p \to 0$.) We let $E_1$ denote this set of possible 1-values and we split
$E_1 = P_1 \cup T_1$ with $P_1 = \{b\}$ and $T_1 = \{a_1, \ldots, a_s\}$. The 1-values in $T_1$ we
call 1-transient, the 1-value in $P_1$ we call 1-persistent.

Now (with an eye toward induction) we define the 2-interval of $i = i_0$.
Take the 1-interval of $i$, say $[i_0, i_1)$. Then take the 1-interval of $i_1$, say $[i_1, i_2)$.
Continue until reaching a 1-interval $[i_u, i_{u+1})$ whose 1-value is 1-transient.
(Of course, this could happen with the very first interval.) We call $[i, i_{u+1})$
the 2-interval of $i$. Now we describe the possible 2-values for this 2-interval.
In terms of Ehrenfeucht value we can write the interval as $b + b + \ldots + b + a_i$
where there are $u$ (possibly zero) $b$'s. Any $b + \ldots + b$ with at least $s$ addends
$b$ has (see §3.4) the same value, call it $B$. Let $jb$ denote the sum of $j$ $b$'s. We
define the transient 2-values $T_2$ as those of the form $jb + a_i$ with $0 \le j < s$
and the persistent 2-values $P_2$ as those of the form $B + a_i$. For example,
let $t = 5$ and $s = 3^5 = 243$. Then $i$ has 2-value $6b + a_{22}$ if, starting at $i$,
six times there are at least 243 zeroes before a one and after the sixth one
there are 21 zeroes and then a one. The 2-value is $B + a_5$ if at least 243
times there are at least 243 zeroes before the next one and the first time
two ones appear less than 243 apart they are exactly 5 apart. What are the
probabilities for $i = i_0$ having any particular 2-value? The first 1-interval
$[i_0, i_1)$ has distribution for 1-value as previously discussed: $\sim p$ for each $a_i$
and $\sim 1$ for $b$. Having determined the first 1-interval the values starting at
$i_1$ have not yet been examined. Hence the 1-value of the second 1-interval
will be independent of the 1-value of the first and, in general, the sequence
of 1-values will be of mutually independent values. Then the transient
2-values $jb + a_i$ each have probability $\sim p$ while the persistent 2-values $B + a_i$
will each have probability $\frac{1}{s} + o(1)$. We let $P_2$ denote the set of persistent
2-values, $T_2$ the set of transient 2-values and $E_2 = P_2 \cup T_2$ the set of 2-values.
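
The definitions of 1-value and 2-value translate directly into a scan over a
0/1 sequence. The sketch below is one possible encoding, with 0-based list
indices and hypothetical names: a 1-value is either the string "b" or a tuple
("a", i), and a 2-value is a pair (head, i) where head is the number j of
leading b's when j < s and the string "B" otherwise.

```python
def one_interval(bits, start, s):
    """Return (end, value) for the 1-interval beginning at index start.

    end is the index just past the first one at or after start. The value is
    ("a", i) when there are i - 1 < s zeroes before that one, and "b" when
    there are at least s zeroes. Returns None if no one is found.
    """
    j = start
    while j < len(bits) and bits[j] == 0:
        j += 1
    if j == len(bits):
        return None
    zeroes = j - start
    value = ("a", zeroes + 1) if zeroes < s else "b"
    return j + 1, value

def two_interval(bits, start, s):
    """Return (end, (head, i)) for the 2-interval beginning at index start.

    head is the count j of persistent 1-intervals if j < s (transient 2-value
    jb + a_i) and "B" if j >= s (persistent 2-value B + a_i).
    """
    count_b = 0
    pos = start
    while True:
        result = one_interval(bits, pos, s)
        if result is None:
            return None          # ran off the end before a transient 1-value
        pos, value = result
        if value == "b":
            count_b += 1
        else:
            head = "B" if count_b >= s else count_b
            return pos, (head, value[1])

if __name__ == "__main__":
    # The example from the text: t = 5, s = 243, expected 2-value 6b + a_22.
    bits = ([0] * 243 + [1]) * 6 + [0] * 21 + [1]
    print(two_interval(bits, 0, 243))
```

Returning `None` when the sequence ends early corresponds to the negligible
cases that the fictitious play over all positive integers is meant to postpone.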

The 3-value will contain all the notions of the general case. Beginning
at $i = i_0$ take its 2-interval $[i_0, i_1)$. Then take successive 2-intervals
$[i_1, i_2), \ldots, [i_{u-1}, i_u)$ until reaching an interval $[i_u, i_{u+1})$ whose 2-value is
transient. The 3-interval for $i$ is then $[i, i_{u+1})$. Let $x_1, \ldots, x_u, y_{u+1}$ be
the 2-values for the successive intervals. From the procedure all $x_i \in P_2$
while $y_{u+1} \in T_2$. Now consider (see §3.1) the Ehrenfeucht equivalence classes
(again with respect to a $t$-move game) over $\Sigma P_2$. ($\Sigma A$ is the set of strings
over alphabet $A$.) Let $\alpha$ be the equivalence class for the string $x_1 \cdots x_u$,
then the 3-value of $i$ is defined as the pair $\beta = \alpha y_{u+1}$. We let $E_3$ be the
set of all such pairs and we call $\beta$ persistent (and place it in $P_3$) if $\alpha$ is a
persistent state (as defined in §3.2) in $\Sigma P_2$; otherwise we call $\beta$ transient and
place it in $T_3$. If $x_1 \cdots x_u$ and $x'_1 \cdots x'_{u'}$ are equivalent as strings in $\Sigma P_2$ then
$x_1 + \ldots + x_u$ and $x'_1 + \ldots + x'_{u'}$ have (as shown in §3.4) the same Ehrenfeucht
value. So the 3-value of $i$ determines the Ehrenfeucht value of the 3-interval
of $i$ though possibly it has more information. What are the probabilities
for the 3-value of $i$? Again we get a string of 2-values $z_1 z_2 \cdots$ whose values
are mutually independent and we stop when we hit a transient 2-value. We
shall see (in the course of the full induction argument) that the probability
of having 3-value $\beta$ is $\sim c_\beta$ for persistent $\beta$ and $\sim c_\beta p$ for transient $\beta$.

Now let us define $k$-interval and $k$-value, including the split into persistent
and transient $k$-values, by induction on $k$. Suppose $E_k, P_k, T_k$ have been
defined. Beginning at $i = i_0$ let $[i_0, i_1)$ be the $k$-interval and then take
successive $k$-intervals $[i_1, i_2), \ldots, [i_{u-1}, i_u)$ until reaching a $k$-interval $[i_u, i_{u+1})$
with transient $k$-value. Then $[i, i_{u+1})$ is the $(k+1)$-interval of $i$. (Incidentally,
suppose $U(i)$. Then $[i, i + 1)$ is the 1-interval of $i$ which is transient. But
then $[i, i + 1)$ is the 2-interval of $i$ and is transient. For all $k$, $[i, i + 1)$ is
the $k$-interval of $i$ and is transient.) Let $x_1, \ldots, x_u, y_{u+1}$ be the successive
$k$-values of the intervals. Let $\alpha$ be the equivalence class of $x_1 \cdots x_u$ in $\Sigma P_k$.
Then $i$ has $(k+1)$-value $\beta = \alpha y_{u+1}$. This value is persistent if $\alpha$ is persistent
and transient if $\alpha$ is transient. This defines $E_{k+1}, P_{k+1}, T_{k+1}$, completing the
induction. Our construction has assured that the $k$-value of $i$ determines
the Ehrenfeucht value of the $k$-interval of $i$, though it may have even more
information.

Now let us fix $i$ and look at the distribution of its $k$-value $V^k$. We
show, by induction on $k$, that for every persistent $\beta$, $\Pr[V^k = \beta] = c_\beta + o(1)$
while for every transient $\beta$, $\Pr[V^k = \beta] = (c_\beta + o(1))p$. Here each $c_\beta$ is
a positive constant. Assume the result for $k$ and set $p_\beta = \Pr[V^k = \beta]$
for all $\beta \in E_k$. Let $p^*$ be the probability that $V^k$ is transient so that
$p^* \sim cp$, $c$ a positive constant. Let $x_1, \ldots, x_u, y_{u+1}$ be the successive
$k$-values of the $k$-intervals beginning at $i$, stopping at the first transient value.
We can assume these values are taken independently from the inductively
defined distribution on $E_k$. The distribution of the first transient value is the
conditional distribution of $V^k$ given that $V^k$ is transient so the probability
that it is some transient $y$ is $d_y + o(1)$ where $d_y = c_y / \sum c_{y'}$, the sum over
all transient $y'$. Note all $d_y$ are positive constants.

The key to the argument is the distribution for the Ehrenfeucht equivalence
class $\alpha$ for the finite sequence $x_1 \cdots x_u \in \Sigma P_k$. Let $M$ be the set of all
such equivalence classes. Let $L_u$ be the event that precisely $u$ persistent $x$'s
are found and then a transient $y$. Then $\Pr[L_u] = (1 - p^*)^u p^*$ precisely. For
$\beta \in P_k$ let $p^+_\beta = p_\beta / (1 - p^*)$, the conditional probability that $V^k = \beta$ given
that $V^k$ is persistent. Note that (as $p \to 0$)

$$p^+_\beta \sim p_\beta \sim c_\beta$$

Conditioning on $L_u$ the $x_1, \ldots, x_u$ are mutually independent with
distributions given by the $p^+_\beta$. Define on $M$ a Markov Chain (see §3.3) with
transition probability $p^+_\beta$ from $\alpha$ to $\alpha + \beta$. We let $M(p)$ denote this
Markov Chain. Observe that the set of states $M$ is independent of $p$ and
the nonzeroness of the transition probabilities is independent of $p \in (0, 1)$
though the actual transition probabilities do depend on $p$. There is a particular
state $O$ representing the null sequence. Let $f(u, \alpha)$ be the probability
of being at state $\alpha$ at time $u$, beginning at $O$ at time zero. Then $f(u, \alpha)$ is
precisely the conditional distribution for $\alpha$ given $L_u$. But therefore, letting
$W$ denote the Ehrenfeucht equivalence class,

$$\Pr[W = \alpha] = \sum_{u=0}^{\infty} f(u, \alpha)(1 - p^*)^u p^*$$

Let $M^0$ be the Markov Chain on the same set with transition probability $c_\beta$
from $\alpha$ to $\alpha + \beta$ and let $f^0(u, \alpha)$ be the probability of going from $O$ to $\alpha$ in
$u$ steps under $M^0$. Observe that $M^0$ is the limit of $M(p)$ as $p \to 0$ in that
taking the limit of any (1-step) transition probability in $M(p)$ as $p \to 0$
gives the transition probability in $M^0$.

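Now we need some Markov Chain asymptotics. Assume $\alpha$ is transient.
We claim (recall $p^* \sim cp$)

$$\Pr[W = \alpha] \sim \left[ c \sum_{u=0}^{\infty} f^0(u, \alpha) \right] p$$

and that the interior sum converges. In general the probability of remaining
in a transient state drops exponentially in $u$ so there exist constants $K, \epsilon$ so
that $f^0(u, \alpha) < K(1 - \epsilon)^u$ for all $u$, giving the convergence. Moreover there
exist $\epsilon_1, \epsilon_2, K_1$ so that for all $0 < p < \epsilon_1$ we bound uniformly $f(u, \alpha) <
K_1(1 - \epsilon_2)^u$ for all $u$. Pick $\epsilon_3 < \epsilon_1$ so that for $0 < p < \epsilon_3$ we have $p^* < 2cp$.
For any positive $\delta$ we find $U$ so that for $0 < p < \epsilon_3$

$$\frac{\sum_{u > U} f(u, \alpha)(1 - p^*)^u p^*}{p} \le \sum_{u > U} K_1 (1 - \epsilon_2)^u \frac{p^*}{p} < \frac{2cK_1(1 - \epsilon_2)^U}{\epsilon_2} < \frac{\delta}{2}$$

For any fixed $0 \le u \le U$ we have $\lim_{p \to 0} f(u, \alpha) = f^0(u, \alpha)$ so that

$$\lim_{p \to 0} \frac{\sum_{0 \le u \le U} f(u, \alpha)(1 - p^*)^u p^*}{p} = c \sum_{0 \le u \le U} f^0(u, \alpha)$$

With $U$ sufficiently large this may be made within $\delta/2$ of $c \sum_{u=0}^{\infty} f^0(u, \alpha)$.
But this holds for $\delta$ arbitrarily small, giving the claimed asymptotics of
$\Pr[W = \alpha]$.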



Remark. The rough notion here is that the probability of having a transient
$(k+1)$-value is dominated by having few persistent $k$-intervals and then
a transient $k$-interval. The transient 2-intervals all had at most $s$ persistent
1-intervals. The situation changes with 3-intervals. Recall $Ba_i$ consisted
of at least $s$ ones each preceded by at least $s$ zeroes and then two ones $i$
apart. Consider an arbitrarily long grouping of 2-intervals of 2-value $Ba_i$
but, say, with none of the form $Ba_3$, i.e., 1001 not appearing and then, say,
follow the last one, say $Ba_1$, with a one so that the 3-interval ends 111. For
every $u$ there is a $\sim c_u p$ probability of this being the 3-interval with $u$ such
2-intervals and $c_u > 0$ but all such 3-intervals would be considered transient
since a persistent sequence in $\Sigma P_2$ must surely contain every value in $P_2$.



Now suppose $\alpha$ is persistent. Again we have the precise formula

$$\Pr[W = \alpha] = \sum_{u=0}^{\infty} f(u, \alpha)(1 - p^*)^u p^*$$

only this time it is the tail of the sum that dominates. As $\alpha$ is persistent there
is a limiting probability $L = \lim_{u \to \infty} f^0(u, \alpha)$ with $L > 0$ and furthermore
the $M(p)$ approach $M^0$ in the sense that

$$L = \lim_{p \to 0} \lim_{u \to \infty} f(u, \alpha)$$

We claim

$$\Pr[W = \alpha] = L + o(1)$$

For any $\delta > 0$ there exist $\epsilon$ and $U$ so that for $p < \epsilon$ and $u > U$ we have

$$L - \delta < f(u, \alpha) < L + \delta$$

Then, as $\sum_{u=0}^{\infty} (1 - p^*)^u p^* = 1$,

$$|\Pr[W = \alpha] - L| \le \delta \sum_{u=U}^{\infty} (1 - p^*)^u p^* + (L + 1) \sum_{0 \le u < U} (1 - p^*)^u p^*$$

For fixed $U$ the second sum is $o(1)$ (as $p^* \to 0$) while the first sum is less
than $\delta$ so the entire expression is less than $2\delta$ for $p$ sufficiently small. As $\delta$
was arbitrary this gives the claim.
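
The two regimes can be seen numerically on a toy example. The sketch below
evaluates a truncated version of the sum $\sum_u f(u)(1 - p^*)^u p^*$ for two
hypothetical choices of $f$: one that decays geometrically (mimicking a
transient state) and one that tends to a limit (mimicking a persistent state
with $L = 1$). The chain, the constant `Q`, and all function names are
assumptions made for illustration and are not the chain $M(p)$ of the text.

```python
def geometric_mixture(f, p_star, terms):
    """Truncated sum of f(u) * (1 - p_star)**u * p_star for u = 0..terms-1."""
    total = 0.0
    weight = p_star                  # value of (1 - p_star)**u * p_star at u = 0
    for u in range(terms):
        total += f(u) * weight
        weight *= 1.0 - p_star
    return total

Q = 0.5  # hypothetical one-step probability of leaving the start state for good

def f_transient(u):
    """Probability of still sitting in the start state after u steps."""
    return (1.0 - Q) ** u

def f_persistent(u):
    """Probability of having reached the absorbing state within u steps."""
    return 1.0 - (1.0 - Q) ** u

if __name__ == "__main__":
    for p_star in (1e-1, 1e-2, 1e-3):
        terms = int(40 / p_star)     # enough terms that the neglected tail is tiny
        t = geometric_mixture(f_transient, p_star, terms)
        r = geometric_mixture(f_persistent, p_star, terms)
        print(p_star, t / p_star, r)
```

In this toy case the transient mixture divided by $p^*$ approaches
$\sum_u (1 - Q)^u = 1/Q$, while the persistent mixture approaches the limit 1,
matching the shape of the two claims.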

Recall that the $(k+1)$-value of the full $(k+1)$-interval is a pair consisting of
the Ehrenfeucht value $W$ just discussed and the $k$-value of the first transient
type $y_{u+1}$. The transient type's value has a limiting distribution which is
independent of $W$, for conditional on any $L_u$ the distribution on $y_{u+1}$ is the
same. All possible $y \in T_k$ have a limiting probability $d_y \in (0, 1)$. Hence
the probability of a $(k+1)$-value being $\beta = \alpha y$ is simply the product of the
probabilities and hence approaches a constant if $\alpha$, and hence $\beta$, is persistent
and is $\sim cp$ if $\alpha$, and hence $\beta$, is transient. This completes the inductive
argument for the limiting probabilities of the $k$-values of the $k$-intervals.

We now let $L = L^k$ be the length of the $k$-interval of $i$ and find bounds on
the distribution of $L$. A simple induction shows that if the sequence $1 \cdots 1$
of $k$ ones appears after $i$ then the $k$-interval of $i$ ends with this sequence or
possibly before. Thus we get the crude bound

$$\Pr[L > ka] < (1 - p^k)^a$$

so that asymptotically

$$\Pr[L > a p^{-k}] < e^{-ca}$$

where $c$ is a positive constant. In fact, this gives the correct order of
magnitude: $L$ is (speaking roughly) almost always on the order of $p^{-k}$. We claim
that there are positive constants $\epsilon_t, c_t$ so that

$$\Pr[L^t > \epsilon_t p^{-t}] > c_t$$

The argument is by induction. For $t = 1$ the random variable $L^1$ is simply
the number of trials until a success which occurs with probability $p$ and
the distribution is easily computable. Assume this true for $t$ and let (as
previously shown) $e_t p$ be the asymptotic probability that a $t$-interval will
be transient. Pick $f_t$ positive with $f_t e_t < .5$. With probability at least
.5, the first $\gamma = f_t p^{-1}$ $t$-intervals after $i$ will be persistent. Conditioning
on an interval being persistent is conditioning on an event that holds with
probability $1 - o(1)$ so that each of these $t$-intervals will have length at least
$\epsilon_t p^{-t}$ with probability at least $c_t - o(1)$. As the lengths are independent,
with conditional probability at least .99 at least $c_t \gamma / 2$ of the intervals have
length at least $\epsilon_t p^{-t}$. Thus with probability at least, say, .4 the total length
$L^{t+1}$ is at least $c_t \gamma \epsilon_t p^{-t} / 2$ which is $\epsilon_{t+1} p^{-(t+1)}$ for an appropriate constant
$\epsilon_{t+1}$, completing the induction.

Up to now the relation between $p$ and $n$, the number of integers, has not
appeared. Recall that $p \to 0$ and $n \to \infty$ so that $np^k \to \infty$ but $np^{k+1} \to 0$.
Now begin at $i = i_0 = 1$ and generate the $k$-interval $[i_0, i_1)$. Then generate
the $k$-interval $[i_1, i_2)$ beginning at $i_1$ and continue. (We do this with $k$
fixed. Even if one of the intervals is transient we simply continue with
$k$-intervals. Again we imagine continuing forever through the integers.) Let
$N$ be that maximal $u$ for which $i_u - 1 \le n$, so that we have split $[n]$ into
$N$ $k$-intervals plus some excess. As each sequence of $k$ ones definitely will
end a $k$-interval, $N$ is at least the number of disjoint subintervals of $k$ ones.
Simple expectation and variance calculations show that $N > .99 np^k$ almost
surely. On the other side set, with foresight, $c = 4 c_k^{-1} \epsilon_k^{-1}$. If $N > cnp^k$
then the sum of the lengths of the first $cnp^k$ $k$-intervals would be less than $n$.
But these lengths are independent identically distributed variables and each
length is at least $\epsilon_k p^{-k}$ with probability at least $c_k$ so that almost surely at
least $c_k c n p^k / 2$ of them would have length at least $\epsilon_k p^{-k}$ and thus their total
length would be at least $(c c_k \epsilon_k / 2) n > n$. That is, almost surely

$$C_1 n p^k < N < C_2 n p^k$$

where $C_1, C_2$ are absolute constants.

Let $\beta_1, \ldots, \beta_N$ be the $k$-values of the $k$-intervals generated by this
procedure. Now we make two claims about this procedure. We first claim that
almost surely none of the $\beta_i$ are transient. Each $\beta_i$ has probability $\sim cp$ of
being transient so the probability that some $\beta_i$, $1 \le i \le C_2 n p^k$, is transient
is at most $\sim (cp) C_2 n p^k = O(np^{k+1}) = o(1)$. And almost surely $N < C_2 n p^k$,
proving the claim.

Let $A_1 \cdots A_R$ be any fixed sequence of elements of $P_k$. The second claim
is that almost surely $A_1 \cdots A_R$ appears as a subsequence of the $\beta$ sequence,
more precisely that almost surely there exists $i$ with $1 \le i \le N - R$ so that
$\beta_{i+j} = A_j$ for $1 \le j \le R$. (For technical reasons we want the subsequence
not to start with $\beta_1$.) As each $\beta_i$ has a positive probability of being any
particular $x \in P_k$ and the $\beta_i$ are independent and $C_1 n p^k \to \infty$, almost surely
this fixed sequence will appear in the first $C_1 n p^k$ $\beta$'s. And almost surely
$N > C_1 n p^k$, proving the claim.

We have a third claim that is somewhat technical. For any $1 \le j < k$ let
$\beta_1, \ldots, \beta_u$ denote the $j$-values of the successive $j$-intervals starting at one,
where $\beta_u$ is the last such interval that is in $P_j$. We know that almost surely
$\beta_1 \cdots \beta_u$ is persistent in $\Sigma P_j$. We claim further that almost surely $\beta_2 \beta_3 \cdots \beta_u$
is persistent in $\Sigma P_j$. It suffices to show this for any particular $j$ as there are
only a finite number of them. For any integer $A$ we have $u - 1 > A$ almost
surely and the probability that $\beta_2 \cdots \beta_{A+1}$ is transient goes to zero as $A$ grows
so almost surely $\beta_2 \cdots \beta_u$ is persistent. Let us call $[b, c)$ a super $k$-interval
(for a given $U$) if it is a $k$-interval and further, for every $1 \le j < k$, letting
$\beta_1, \ldots, \beta_u$ be the successive $j$-values of the $j$-intervals beginning at $b$ and
stopping with the last persistent value, $\beta_2 \beta_3 \cdots \beta_u$ is persistent
in $\Sigma P_j$. So almost surely the $k$-interval $[1, i_1)$ is a super $k$-interval.

We shall show, for an appropriate sequence $A_1, \ldots, A_R$, that all $U$
satisfying the above three claims give models $\langle [n], C, U \rangle$ which have the same
Ehrenfeucht value.

We first need some glue. Call $[a, b)$ an incomplete $k$-interval (with respect
to some fixed arbitrary $U$) if the $k$-interval beginning at $a$ is not completed
by $b - 1$. Suppose $[a, b)$ is an incomplete $k$-interval and $[b, c)$ is a persistent
super $k$-interval. We claim $[a, c)$ is a persistent $k$-interval. The argument
is by induction on $k$. For $k = 1$, $[a, b)$ must consist of just zeroes while
$[b, c)$ consists of at least $s$ zeroes followed by a one. But then so does $[a, c)$.
Assume the result for $k$ and let $[a, b)$ be an incomplete $(k+1)$-interval and
$[b, c)$ be a persistent $(k+1)$-interval. We split $[a, b)$ into a (possibly empty)
sequence $x_1, x_2, \ldots, x_r$ of persistent $k$-intervals followed by a (possibly null)
incomplete $k$-interval $[a^+, b)$ with value, say, $y$. We split $[b, c)$ (renumbering for
convenience) into a sequence $x_{r+1}, \ldots, x_s, y_{s+1}$ of $k$-intervals, all persistent
except the last which is transient. Then, by induction, $y + x_{r+1}$ is a persistent
$k$-interval with some value $x'_{r+1}$. Then $[a, c)$ splits into $k$-intervals with
values $x_1, \ldots, x_r, x'_{r+1}, x_{r+2}, \ldots, x_s, y_{s+1}$. By the super-persistency $x_{r+2} \cdots x_s$
is persistent in $\Sigma P_k$ and hence (see §3.2) so is $x_1 \cdots x_r x'_{r+1} x_{r+2} \cdots x_s$ and
therefore $[a, c)$ is a persistent $(k+1)$-interval.

Now let $\langle [n], C, U \rangle$ be any model that meets the three claims above,
all of which hold almost surely for $p$ in this range. We set $i = i_0 = 1$ and find
successive $k$-intervals $[i_0, i_1), [i_1, i_2), \ldots$ until $[i_{u-1}, i_u)$ and then $U$ on $[i_u, n]$
gives an incomplete $k$-interval. By the third claim $[1, i_1)$ is superpersistent
and so the "interval" $[i_u, n] \cup [1, i_1)$ (going around the corner) is $k$-persistent.
Hence we have split $[n]$ (now thinking of it as a cycle with 1 following $n$)
into $k$-persistent intervals with $k$-values $x_1, x_2, \ldots, x_u$. The $k$-value for $x_1$
may be different from that for $[1, i_1)$ but the others have remained the same.
This sequence contains the sequence $A_1 \cdots A_R$ described in §3.5. But this
implies (see §3.6) that the Ehrenfeucht value is determined, completing the
proof.

3 Background. 

3.1 The Ehrenfeucht Game. 

Let $A$ be a fixed finite alphabet (in application $A$ is $P_k$ or $\{0, 1\}$) and $t$ a
fixed positive integer. We consider the space $\Sigma A$ of finite sequences $a_1 \cdots a_u$
of elements of $A$. We can associate with each sequence a model $\langle [u], <, f \rangle$
where $f : [u] \to A$ is given by $f(i) = a_i$. For completeness we describe the
$t$-round Ehrenfeucht Game on sequences $a_1 \cdots a_u$ and $a'_1 \cdots a'_{u'}$. There are
two players, Spoiler and Duplicator. On each round the Spoiler first selects
one term from either sequence and then the Duplicator chooses a term from
the other sequence. Let $i_1, \ldots, i_t$ be the indices of the terms chosen from the
first sequence, $i_q$ in the $q$-th round, and let $i'_1, \ldots, i'_t$ denote the corresponding
indices in the second sequence. For Duplicator to win he must first assure
that $a_{i_q} = a'_{i'_q}$ for each $q$, i.e. that he selects each round the same letter as
Spoiler did. Second he must assure that for all $a, b$

$$i_a < i_b \Leftrightarrow i'_a < i'_b \quad \text{and} \quad i_a = i_b \Leftrightarrow i'_a = i'_b$$

(It is a foolish strategy for Spoiler to pick an already selected term since
Duplicator will simply pick its already selected counterpart but this possibility
comes in in the Recursion discussed later.) This is a perfect information
game so some player will win. Two sequences are called equivalent if Duplicator
wins. Ehrenfeucht showed that this is an equivalence relation and that
two sequences are equivalent if and only if their models have the same truth
value on all sentences of quantifier depth at most $t$. We let $M$ denote the set of
equivalence classes which is known to be a finite set. $\Sigma A$ forms a semigroup
under concatenation, denoted $+$, and this operation filters to an operation,
also denoted $+$, on $M$. We use $x, y, \ldots$ to denote elements of $M$; $x + y$ their
sum; $O$ is the equivalence class of the null sequence which acts as identity.
We associate $a \in A$ with the sequence $a$ of length one and its equivalence
class (which contains only it), also called $a$. We let $jx$ denote $x + \ldots + x$
with $j$ summands. From analysis of the Ehrenfeucht game (see §3.4) it is
known that there exists $s$ (for definiteness we may take $s = 3^t$) so that:

$$jx = kx \quad \text{for all } j, k \ge s,\ x \in M$$

Example. With $A = \{0, 1\}$ we naturally associate sequences such as 101
with $\langle \{1, 2, 3\}, <, f \rangle$ with $f(1) = 1$, $f(2) = 0$, $f(3) = 1$. The addition of
101 and 1101 is their concatenation (in that order) 1011101. The first order
language has as atomic formulas $x < y$, $x = y$ and $f(x) = a$ for each $a \in A$.
The sentence

$$\exists x\, \exists y\, \exists z\ f(x) = 1 \wedge f(y) = 0 \wedge f(z) = 1 \wedge x < y \wedge y < z$$

is satisfied by 01110001 but not by 000111000 so these are in different
equivalence classes with $t = 3$. We could also write that 101 appears as consecutive
terms with

$$\exists x\, \exists y\, \exists z\ f(x) = 1 \wedge f(y) = 0 \wedge f(z) = 1 \wedge x < y \wedge y < z \wedge \neg \exists w\, [(x < w \wedge w < y) \vee (y < w \wedge w < z)]$$

Informally we would just say $\exists x\ f(x) = 1 \wedge f(x + 1) = 0 \wedge f(x + 2) = 1$ but the
quantifier depth is four.
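
For short strings the game can be decided by exhaustive search. The sketch
below is a brute-force recursion over all Spoiler moves and Duplicator replies,
using 0-based indices. It is an illustration of the rules described above, not
an efficient algorithm, and the name `duplicator_wins` is hypothetical.

```python
from functools import lru_cache

def duplicator_wins(seq1, seq2, t):
    """Return True if Duplicator wins the t-round game on the two strings."""

    def consistent(picks, i, j):
        # The new pair must carry equal letters and agree with every earlier
        # pair on order and on equality of indices.
        if seq1[i] != seq2[j]:
            return False
        return all((a < i) == (b < j) and (a == i) == (b == j)
                   for a, b in picks)

    @lru_cache(maxsize=None)
    def wins(picks, rounds):
        if rounds == 0:
            return True

        def reply_ok(i, j):
            if not consistent(picks, i, j):
                return False
            return wins(tuple(sorted(set(picks) | {(i, j)})), rounds - 1)

        # Spoiler may move in either sequence; Duplicator needs a good reply
        # to every such move.
        for i in range(len(seq1)):
            if not any(reply_ok(i, j) for j in range(len(seq2))):
                return False
        for j in range(len(seq2)):
            if not any(reply_ok(i, j) for i in range(len(seq1))):
                return False
        return True

    return wins((), t)

if __name__ == "__main__":
    # The pair of strings from the Example, with t = 3.
    print(duplicator_wins("01110001", "000111000", 3))
```

The running time grows quickly with t and with the string lengths, so this is
only suitable for experiments on small cases such as the Example.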

3.2 Persistent and Transient. 

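Definition and Theorem. We call $x \in M$ persistent if

$$\forall y\, \exists z\ \ x + y + z = x \qquad (1)$$

$$\forall y\, \exists z\ \ z + y + x = x \qquad (2)$$

$$\exists p\, \exists s\, \forall y\ \ p + y + s = x \qquad (3)$$

These three properties are equivalent. We call $x$ transient if it is not
persistent.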
Proof of Equivalence. 

(3) $\Rightarrow$ (1): Take $z = s$, regardless of $y$. Then

$$x + y + z = (p + y + s) + y + s = p + (y + s + y) + s = x$$

(1) $\Rightarrow$ (3): Let $R_x = \{x + v : v \in M\}$. We first claim there exists $u \in M$
with $|R_x + u| = 1$, i.e., all $x + y + u$ the same. Otherwise take $u \in M$
with $|R_x + u|$ minimal and say $v, w \in R_x + u$. As $R_x + u \subseteq R_x$ we write
$v = x + u_1$, $w = x + u_2$. From (1), with $y = u_1$, we have $x = v + u_3$ and thus
$w = v + u_4$ with $u_4 = u_3 + u_2$. Then

$$w + s u_4 = v + (s + 1) u_4 = v + s u_4$$

Adding $s u_4$ to $R_x + u$ sends $v, w$ to the same element so $|R_x + u + s u_4| < |R_x + u|$,
contradicting the minimality. Now say $R_x + u = \{u_5\}$. Again by (1) there
exists $u_6$ with $u_5 + u_6 = x$. Then $R_x + (u + u_6) = \{x\}$ so that (3) holds with
$p = x$, $s = u + u_6$.

By reversing addition (noting that (3) is selfdual while the dual of (1) 
is (2)) these arguments give that (3) and (2) are equivalent, completing the 
proof. 

Let $x$ be persistent and consider $v = x + y$. Let $z$ be such that $x + w + z =
x$ for all $w$. Then for all $w$, $v + w + (z + y) = (x + (y + w) + z) + y = x + y = v$ and
hence $v$ is persistent. Dually, if $x$ is persistent $y + x$ is persistent. Together

If $x$ is persistent then $w_1 + x + w_2$ is persistent

for any $w_1, w_2 \in M$.

From (1) the relation $x \equiv_R u$ defined by $\exists v\, (x + v = u)$ is an equivalence
relation on the set of persistent $x \in M$. We let $R_x$ denote the $\equiv_R$-class
containing $x$ so that

$$R_x = \{x + v : v \in M\}$$

From (2) the relation $x \equiv_L u$ defined by $\exists v\, (v + x = u)$ is also an equivalence
relation on the set of persistent $x \in M$. We let $L_x$ denote the $\equiv_L$-class
containing $x$ so that $L_x = \{v + x : v \in M\}$. Let $x$ be persistent and let $p, s$
(by (3)) be such that $p + z + s = x$ for all $z$. Setting $z = O$, $x = p + s$. Thus
for all $z$

$$x + z + x = (p + s) + z + (p + s) = p + (s + z + p) + s = x$$

Let $R_x, L_y$ be equivalence classes under $\equiv_R, \equiv_L$ respectively. Then $x +
y \in R_x \cap L_y$. Let $z \in R_x \cap L_y$. Then there exist $a, b$ with $x = z + a$ and
$y = b + z$ so that $x + y = z + (a + b) + z$. But as $z$ is persistent the above
argument (with $z$ as $x$ and $a + b$ as $z$) gives $z + (a + b) + z = z$. Thus

$$R_x \cap L_y = \{x + y\} \quad \text{for all persistent } x, y$$
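
The equivalence of (1), (2), (3) and the identity for $R_x \cap L_y$ can be
checked by brute force on a small example. The sketch below uses a hypothetical
toy monoid, not one of the Ehrenfeucht monoids $M$ of the text: its elements
are an identity "O" together with pairs (first letter, last letter) over
{0, 1}, and the sum keeps the first letter of the left summand and the last
letter of the right summand. Every element is idempotent, so $jx = kx$ holds
for all $j, k \ge 1$. All names are illustrative.

```python
from itertools import product

IDENTITY = "O"
ELEMENTS = [IDENTITY] + list(product("01", repeat=2))

def add(x, y):
    """Toy addition: remember the first letter of x and the last letter of y."""
    if x == IDENTITY:
        return y
    if y == IDENTITY:
        return x
    return (x[0], y[1])

def cond1(x):
    """Property (1): for all y there is z with x + y + z = x."""
    return all(any(add(add(x, y), z) == x for z in ELEMENTS) for y in ELEMENTS)

def cond2(x):
    """Property (2): for all y there is z with z + y + x = x."""
    return all(any(add(add(z, y), x) == x for z in ELEMENTS) for y in ELEMENTS)

def cond3(x):
    """Property (3): there are p, s with p + y + s = x for all y."""
    return any(all(add(add(p, y), s) == x for y in ELEMENTS)
               for p in ELEMENTS for s in ELEMENTS)

def right_class(x):
    return {add(x, v) for v in ELEMENTS}

def left_class(x):
    return {add(v, x) for v in ELEMENTS}

if __name__ == "__main__":
    for x in ELEMENTS:
        print(x, cond1(x), cond2(x), cond3(x))
```

In this toy monoid the identity is the only transient element, and the
intersection of a right class with a left class is a single element, as in the
displayed identity.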



Remarks. Let A = {0, 1}. A sequence a_1 ··· a_u is transient if and only if
there is a sentence Q of quantifier depth at most t so that a_1 ··· a_u fails Q
but there is an extension a_1 ··· a_u a_{u+1} ··· a_v which satisfies Q such that
all further extensions a_1 ··· a_v a_{v+1} ··· a_w also satisfy Q. For example, with
t = 4, let Q be the existence of a block 101. If a sequence does not satisfy
Q then the extension given by adding 101 does satisfy Q and all further
extensions will satisfy Q. Thus for a_1 ··· a_u to be persistent for t = 4 it must
contain 101 and indeed all blocks of length three. We think of property (3) of
persistency as indicating that a persistent sequence is characterized by p, its
prefix, and s, its suffix. There are properties such as ∃x f(x) = 1 ∧ ¬∃y y < x
that depend on the left side of the sequence, in this case the value f(1).
There are other properties such as ∃x f(x) = 1 ∧ ¬∃y x < y which depend
on the right side of the sequence. There will be sequences with values p, s
for the left and right side respectively so that the Ehrenfeucht value of the
sequence is now determined, regardless of what is placed in the middle.
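
The block example can be checked mechanically. The sketch below is only an illustration: the helper name contains_block, the sample sequence and the bound of six extra symbols are assumptions made here, not part of the text. It confirms that a sequence failing Q satisfies Q once 101 is appended, and that every further extension (up to the chosen bound) still satisfies Q.

```python
from itertools import product

def contains_block(seq, block="101"):
    """Q: the 0/1 sequence (given as a string) contains `block` as consecutive terms."""
    return block in seq

# A sample sequence that fails Q, and its extension by the block 101.
a = "0011000"
extended = a + "101"

# Check every further extension of `extended` by at most 6 symbols.
all_extensions_ok = all(
    contains_block(extended + "".join(tail))
    for m in range(7)
    for tail in product("01", repeat=m)
)
```

Since Q only asks for the existence of a block, appending symbols can never destroy it; the exhaustive check simply makes that monotonicity visible.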



Remarks. Certain sentences Q have the property that if any a_1 ··· a_u
satisfies Q then all extensions a_1 ··· a_u a_{u+1} ··· a_v satisfy Q. The sentence
that the first term of the sequence is 1 has this property; the sentence that
the last term of the sequence is 1 does not have this property. Call such
properties unrighteous, as they (roughly) do not depend on the right hand
side of the sequence. Sequences with Ehrenfeucht value in a given R_x all
have the same truth value for all unrighteous properties. Sequences with
Ehrenfeucht value in a given L_x would all have the same truth value for all
(correspondingly defined) unleftuous properties.

3.3 The Markov Chain. 

Now consider a probability distribution over A, selecting each a with nonzero
probability p_a. This naturally induces a distribution over A^u, the sequences
of length u, assuming each element is chosen independently. This then leads
to a distribution over the equivalence classes M. For all u > 0, x ∈ M let
P_u(x) be the probability that a random string a_1 ··· a_u is in class x. On M
we define a Markov Chain, for each x the transition probability from x to
x + a being p_a.

In Markov Chain theory the states x ∈ M are split into persistent and
transient; a state x is persistent if and only if it lies in a minimal closed set.
We claim Markov Chain persistency is precisely persistency as defined by
(1), (2), (3). If C is closed and x ∈ C then R_x ⊆ C and R_x is itself closed.
If x satisfies (1) then R_u = R_x for all u = x + y ∈ R_x so x is Markov Chain
persistent. Conversely if x is Markov Chain persistent then R_x must be
minimal closed so R_u = R_x for all u = x + y ∈ R_x and so x satisfies (1).
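
The equivalence between the two notions of persistency can be illustrated on a small example. The sketch below does not use the Ehrenfeucht classes themselves; as a loudly stated assumption it replaces M by a toy monoid in which a 0/1 sequence is summarized by (first letter, set of letters seen, last letter). All names (add_letter, reachable, is_persistent) are hypothetical. A state is declared persistent when every state reachable from it can reach it back, which is the statement R_u = R_x for all u ∈ R_x.

```python
ALPHABET = (0, 1)
EMPTY = (None, frozenset(), None)  # toy stand-in for the identity class O

def add_letter(x, a):
    """Toy transition x -> x + a: keep the first letter, record a as seen, a becomes last."""
    first, seen, _last = x
    return (a if first is None else first, seen | {a}, a)

def reachable(x):
    """R_x in the toy chain: every state x + v, including x itself (empty v)."""
    found, stack = {x}, [x]
    while stack:
        y = stack.pop()
        for a in ALPHABET:
            z = add_letter(y, a)
            if z not in found:
                found.add(z)
                stack.append(z)
    return frozenset(found)

def is_persistent(x):
    """Property (1) in Markov chain form: every u in R_x can return to x."""
    return all(x in reachable(u) for u in reachable(x))

states = reachable(EMPTY)
persistent = {x for x in states if is_persistent(x)}
closed_classes = {reachable(x) for x in persistent}  # the minimal closed sets
```

In this toy chain the persistent states are exactly those that have seen both letters, and the minimal closed sets are indexed by the first letter, mirroring the remark that an R_x class fixes everything that depends only on the left side of the sequence.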

The Markov Chain M restricted to a minimal closed set R_x is aperiodic
since x + sa ∈ R_x and (x + sa) + a = x + sa. Hence from Markov Chain
theory when x is persistent lim_{u→∞} P_u(x) exists.

A random walk on M, beginning at O, will with probability one eventually
reach a minimal closed set R_x and then it must stay in R_x forever. Let
P[R_x] denote the probability that R_x is the closed set reached.
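
The quantities P_u(x) and P[R_x] can be computed directly for a small chain. The following sketch is an illustration under stated assumptions: M is replaced by a toy monoid in which a 0/1 sequence is summarized by (first letter, letters seen, last letter), the letter probabilities 0.3 and 0.7 are arbitrary, and the names step and P_closed are hypothetical. It pushes the distribution P_u forward one letter at a time and reads off the mass absorbed in each minimal closed set.

```python
ALPHABET_PROBS = {0: 0.3, 1: 0.7}   # assumed letter probabilities p_a
EMPTY = (None, frozenset(), None)   # toy stand-in for the identity class O

def add_letter(x, a):
    """Toy transition x -> x + a on summaries (first letter, letters seen, last letter)."""
    first, seen, _last = x
    return (a if first is None else first, seen | {a}, a)

def step(dist):
    """One step of the chain: move mass from each x to x + a with probability p_a."""
    new = {}
    for x, mass in dist.items():
        for a, p in ALPHABET_PROBS.items():
            y = add_letter(x, a)
            new[y] = new.get(y, 0.0) + mass * p
    return new

dist = {EMPTY: 1.0}        # P_0: all mass on the empty sequence
for _ in range(200):       # afterwards dist is P_200
    dist = step(dist)

# In this toy chain the minimal closed sets are indexed by the first letter.
full = frozenset({0, 1})
P_closed = {
    first: sum(m for (f, seen, _l), m in dist.items() if f == first and seen == full)
    for first in (0, 1)
}
```

Here the mass left on transient states after u steps is 0.3^u + 0.7^u, so P_closed approaches (0.3, 0.7) geometrically fast; in general the rate depends on the chain.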

3.4 Recursion. 

Again let A be a finite alphabet, M the set of equivalence classes of ΣA and
now specify some B ⊆ M. As B is also a finite set we can define equivalence
classes (with respect to the same constant t) on ΣB; let M+ denote the set
of such classes. Now let b_1 ··· b_u and b'_1 ··· b'_{u'} be equivalent sequences of
ΣB. We claim that

b_1 + ... + b_u = b'_1 + ... + b'_{u'}

as elements of M. Let s_1, ..., s_u, s'_1, ..., s'_{u'} be specific elements of ΣA in the
respective b_i or b'_i classes. It suffices to give a strategy for Duplicator with
models s_1 + ... + s_u and s'_1 + ... + s'_{u'}. Suppose Spoiler picks an element
x in, say, some s_i. In the game on ΣB we know Duplicator has a winning
reply to b_i of some b'_{i'}. Now Duplicator will pick some x' in s'_{i'}. To decide
the appropriate x' in s'_{i'} to pick Duplicator considers a subgame on s_i and
s'_{i'}. As these are equivalent Duplicator will be able to find such x' for the at
most t times that he is required to.

This general recursion includes the previous statement that for all j, k >
s and any x ∈ M we have jx = kx. Here B = {x} and this says that
Duplicator can win the t-move Ehrenfeucht game between a sequence of j x's
and a sequence of k x's; that is, that <[j], <> and <[k], <> are equivalent,
a basic result on Ehrenfeucht games. In our argument we apply it inductively
with A = P_k. We know, inductively, that all k-intervals having the same
k-value x ∈ P_k have the same Ehrenfeucht value. Now the k + 1-interval
of i is associated with a sequence x_1 ··· x_u ∈ ΣP_k and a "tail" y_{u+1} ∈ T_k.
We call two such k + 1-intervals equivalent if x_1 ··· x_u and x'_1 ··· x'_{u'} are
equivalent in ΣP_k and y_{u+1} = y'_{u'+1}. Now x_1 + ... + x_u = x'_1 + ... + x'_{u'} and
so the k + 1-intervals have equal Ehrenfeucht value.
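
The basic result on pure linear orders can be verified by brute force for small parameters. The sketch below is not taken from the text: the function name duplicator_wins is hypothetical, and it relies on the standard fact that after Spoiler's point a is answered by b, the rest of the game splits into independent (t-1)-move games on the parts to the left and to the right of the chosen points.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def duplicator_wins(j, k, t):
    """True if Duplicator wins the t-move Ehrenfeucht game on <[j],<> versus <[k],<>."""
    if t == 0:
        return True
    # Spoiler picks a in [j]; Duplicator needs some b in [k] winning on both sides.
    spoiler_in_first = all(
        any(duplicator_wins(a - 1, b - 1, t - 1) and duplicator_wins(j - a, k - b, t - 1)
            for b in range(1, k + 1))
        for a in range(1, j + 1)
    )
    # Symmetric case: Spoiler picks b in [k]; Duplicator answers with some a in [j].
    spoiler_in_second = all(
        any(duplicator_wins(a - 1, b - 1, t - 1) and duplicator_wins(j - a, k - b, t - 1)
            for a in range(1, j + 1))
        for b in range(1, k + 1)
    )
    return spoiler_in_first and spoiler_in_second
```

For small t this recursion reproduces the classical threshold: two orders of different sizes are equivalent in the t-move game exactly when both have at least 2^t - 1 elements, so for instance duplicator_wins(7, 9, 3) holds while duplicator_wins(6, 7, 3) does not.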

3.5 Cycles.

Again let M be the set of equivalence classes on ΣA. Now consider cycles
a_1 ··· a_u (thinking of a_1 following a_u) with a_i ∈ A and consider equivalence
classes under the (t + 1)-move Ehrenfeucht game. Here we must preserve
the ternary clockwise predicate C(x, y, z). Any first move a_i reduces the
cycle to a linear structure a_i ··· a_u a_1 ··· a_{i-1} of the form <[u], <, f> with
an Ehrenfeucht value x = x_i. Two cycles are equivalent if they yield the
same set of values x_i ∈ M.

For every persistent x ∈ M let (by (3)) p = p_x, s = s_x be such that
x = p_x + y + s_x for all y ∈ M. Let P_x and S_x be fixed sequences (i.e., elements
of ΣA) for these equivalence classes and let R_x be the sequence consisting
of S_x in reverse order followed by P_x. If the cycle a_1 ··· a_u contains R_x
as a subsequence then selecting a_i as the first element of P_x gives a linear
structure beginning with P_x and ending with S_x, hence of value p_x + y + s_x = x.
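
The effect of a first move on a cycle is easy to make concrete. In the sketch below the sequences S and P, the sample cycle and the helper names linearize and find_cyclic_block are all hypothetical, and the block searched for is taken to be S immediately followed by P, which is an assumption about how R_x is laid out. Cutting the cycle at the first element of P then gives a linear sequence that begins with P and ends with S.

```python
def linearize(cycle, i):
    """Linear sequence a_i ... a_u a_1 ... a_{i-1} for a first move at index i (0-based)."""
    return cycle[i:] + cycle[:i]

def find_cyclic_block(cycle, block):
    """Index where `block` starts as consecutive terms of the cycle, or None if absent."""
    doubled = cycle + cycle          # lets a block wrap around the end of the list
    for i in range(len(cycle)):
        if doubled[i:i + len(block)] == block:
            return i
    return None

S = [1, 0]                           # hypothetical fixed suffix sequence S_x
P = [0, 0]                           # hypothetical fixed prefix sequence P_x
cycle = [0, 1, 1, 0, 1, 0, 0]

start = find_cyclic_block(cycle, S + P)        # the block wraps around the end here
first_of_P = (start + len(S)) % len(cycle)
line = linearize(cycle, first_of_P)
```

The search assumes the block is no longer than the cycle; for the universal sequences of the text this holds once the cycle contains the block at all.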

Let R ∈ ΣA be a specific sequence given by the concatenation of the
above R_x for all persistent x ∈ M. Then we claim R is a universal sequence in
the sense that all a_1 ··· a_u ∈ ΣA (for any u) that contain R as a subsequence
are equivalent. For every persistent x ∈ M there is an a_i so that a_i ··· a_{i-1}
has value x. Conversely every a_i belongs to at most one of the R_x creating R
(maybe none if a_i isn't part of R) and so there will be an R_x not containing
that a_i. Then in a_i ··· a_{i-1} the subsequence R_x will appear as an interval.
Hence the value of a_i ··· a_{i-1} can be written w_1 + x + w_2, which is persistent.
That is, the values of a_i ··· a_{i-1} are precisely the persistent x and hence the
class of a_1 ··· a_u in the circular t + 1-Ehrenfeucht game is determined.

3.6 Recursion on Cycles. 

Again let A be a finite alphabet, M the set of equivalence classes in ΣA and
specify some B ⊆ M. Suppose a cycle a_1 ··· a_u on A may be decomposed into
intervals s_1, ..., s_r with Ehrenfeucht values b_1 ··· b_r. Then the Ehrenfeucht
value of the cycle b_1 ··· b_r determines the Ehrenfeucht value of a_1 ··· a_u. The
argument is the same as for recursion on intervals. Let a_1 ··· a_u and a'_1 ··· a'_{u'}
be decomposed into s_1 ··· s_r and s'_1 ··· s'_{r'} with Ehrenfeucht values b_1 ··· b_r
and b'_1 ··· b'_{r'}. Spoiler picks x in some s_i. In the game on cycles over B
Duplicator can respond b'_{i'} to b_i. Then Duplicator picks an x' ∈ s'_{i'} so that
he can win the subgame on s_i and s'_{i'}.

We apply this in §2 with A = {0, 1} and B = P_k. Here the β ∈ P_k
may have more information than the Ehrenfeucht value but this only helps
Duplicator.

4 The Linear Model. 

We have already remarked in §1 that Zero-One Laws generally do not hold
for the linear model <[n], <, U> and that P. Dolan has characterized those
p = p(n) for which they do. Our main object in this section is the following
convergence result.

Theorem 2. Let k be a positive integer, and S a first order sentence. Then
there is a constant c = c_{k,S} so that for any p = p(n) satisfying

n^{-1/k} ≪ p(n) ≪ n^{-1/(k+1)}

we have

lim_{n→∞} Pr[U_{n,p} ⊨ S] = c

Again we shall fix the quantifier depth t of S and consider Ehrenfeucht
classes with respect to that t. For each β ∈ P_k let c_β be the constant defined
in §2 as the limiting probability that a k-interval has k-value β. Let M be
the set of equivalence classes of ΣP_k, a Markov Chain as defined in §3, and
for each ≡_R-class R_x let P[R_x], as defined in §3, be the probability that a
random sequence β_1 β_2 ··· eventually falls into R_x.

In <[n], <, U> let β_1 ··· β_N denote the sequence of k-values of the
successive k-intervals, denoted [1, i_1), [i_1, i_2), ..., from 1.

Set, with foresight, δ = 10^{-2} 3^{-t}.

We shall call U on [n] right nice if it satisfies two conditions. The first is
simply that all the β_1, ..., β_N described above are persistent. We know from
§2 that this holds almost surely. The second will be a particular universality
condition. Let A_1 ··· A_R be a specific sequence in ΣP_k with the property
that for every R_x and L_y there exists a g so that

A_1 ··· A_g ∈ L_y and A_{g+1} ··· A_R ∈ R_x

(We can find such a sequence for a particular choice of R_x and L_y by taking
specific sequences in ΣP_k in those classes and concatenating them. The full
sequence is achieved by concatenating these sequences for all choices of R_x
and L_y. Note that as some A_1 ··· A_g ∈ L_y the full sequence is persistent.)
The second condition is that inside any interval [x, x + δn] ⊆ [1, n] there
exist R consecutive k-intervals [i_ℓ, i_{ℓ+1}), ..., [i_{ℓ+R}, i_{ℓ+R+1}) whose k-values
are, in order, precisely A_1, ..., A_R. We claim this condition holds almost
surely. We can cover [1, n] with a finite number of intervals [y, y + (δ/2)n] and it
suffices to show that almost always all of them contain such a sequence, so it
suffices to show that a fixed [y, y + (δ/2)n] has such a sequence. Generating the
k-intervals from 1 almost surely a k-interval ends after y and before y + (δ/4)n.
Now we generate a random sequence β_1 ··· on an interval of length (δ/4)n.
But constants do not affect the analysis of §2 and almost surely A_1 ··· A_R
appears.

Now on <[n], <, U> define U^r by U^r(i) if and only if U(n + 1 - i). U^r
is the sequence U in reverse order. Call U left nice if U^r is right nice. Call
U nice if it is right nice and left nice. As all four conditions hold almost
surely, the random U_{n,p} is almost surely nice.

Let U be nice and let β_1 ··· β_N and β^r_1 ··· β^r_{N^r} denote the sequences of
k-values for U and U^r respectively and let R_x and R_{x^r} denote their ≡_R-classes
respectively. (Both exist since the sequences are persistent.)
Claim. The values R_x and R_{x^r} determine the Ehrenfeucht value of nice U.

We first show that Theorem 2 will follow from the Claim. Let R_x, R_{x^r}
be any two ≡_R-classes. Let U be random and consider <[δn], <, U>. The
sequence of k-values lies in R_x with probability P[R_x] + o(1). The same
holds for U^r on [δn]. But U^r examines U on [(1 - δ)n, n] so as δ < .5
the values of the ≡_R-classes are independent and so the joint probability of
the values being R_x and R_{x^r} respectively is P[R_x]P[R_{x^r}] + o(1). Given the
Claim <[n], <, U> would then have a value v = v(R_x, R_{x^r}). As

Σ P[R_x]P[R_{x^r}] = Σ P[R_x] Σ P[R_{x^r}] = 1 × 1 = 1

this would give a limiting distribution for the Ehrenfeucht value v on
<[n], <, U>.
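
The passage from the class probabilities to a limiting distribution is a product-measure computation, sketched below with exact arithmetic. Everything concrete here is a placeholder assumption: the two class labels, their probabilities 3/10 and 7/10, and the function value, which stands in for the map v(R_x, R_{x^r}) whose existence is what the Claim asserts.

```python
from fractions import Fraction
from itertools import product

# Hypothetical absorption probabilities P[R_x]; they must sum to 1.
P = {"R_a": Fraction(3, 10), "R_b": Fraction(7, 10)}

def value(left_class, right_class):
    """Placeholder for v(R_x, R_{x^r}); here it simply returns the pair itself."""
    return (left_class, right_class)

# Limiting probability of each value: sum of P[R_x] * P[R_{x^r}] over the pairs mapping to it.
limit = {}
for rx, rxr in product(P, repeat=2):
    v = value(rx, rxr)
    limit[v] = limit.get(v, Fraction(0)) + P[rx] * P[rxr]

total = sum(limit.values())   # equals (sum of P)^2 = 1
```

If several pairs of classes led to the same Ehrenfeucht value, their products would simply be added into the same entry of limit.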

Now for the claim. Fix two models M = <[n], <, U> and M' = <[n'], <, U'>,
both nice and both with the same values R_x, R_{x^r}. Consider the t-move
Ehrenfeucht game. For the first move suppose Spoiler picks m ∈ M.
By symmetry suppose m ≤ n/2. Let [i_{r-1}, i_r) be one of the k-intervals with,
say, .51n ≤ i_r ≤ .52n. We allow Duplicator a "free" move and have him
select i_r. Let β_1 ··· β_N and β'_1 ··· β'_{N'} be the sequences of k-values for M and
M' respectively. Let z be the class of β_1 ··· β_r. Since U is nice this sequence
already contains A_1 ··· A_R and hence is persistent so z ∈ R_x. Let z' be the
class of β_{r+1} ··· β_N. By the same argument z' is persistent. In M' inside of,
say, [.5n', .51n'] we find the block A_1 ··· A_R. By the universality property we
can split this block into a segment in L_z and another in R_{z'}. Adding more
to the left or right doesn't change the nature of this split. Thus there is
an interval [i'_{r'-1}, i'_{r'}) so that β'_1 ··· β'_{r'} ∈ L_z and β'_{r'+1} ··· β'_{N'} ∈ R_{z'}.
Duplicator plays i'_{r'} in response to i_r.

The class of β_1 ··· β_r is z and z ∈ R_x. The class z'' of β'_1 ··· β'_{r'} is in
L_z and R_x. As z ∈ L_z ∩ R_x, z = z''. Thus [1, i_r) under M and [1, i'_{r'})
under M' have the same Ehrenfeucht value. Thus Duplicator can respond
successfully to the at most t moves (including the initial move m) made in
these intervals. Thus Spoiler may as well play the remaining t - 1 moves
on M_1 = <[i_r, n], <, U> and M'_1 = <[i'_{r'}, n'], <, U'>. These intervals have
lengths n_1 ≥ n/3 and n'_1 ≥ n'/3 respectively. But now M_1 and M'_1 are both nice
with respect to δ_1 = 3δ: the sequence A_1 ··· A_R still appears inside every
interval of length δn ≤ δ_1 n_1 in M_1 and δn' ≤ δ_1 n'_1 in M'_1. Hence we can apply
the same argument for the second move, for convenience still looking at
Ehrenfeucht values with respect to the t move game. After t moves we still
have nice M_t, M'_t with respect to δ_t ≤ 10^{-2} so the arguments are still valid.
But at the end of t rounds Duplicator has won.
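
The bookkeeping behind the choice of δ can be checked with exact fractions. The sketch assumes the reading δ = 10^{-2} 3^{-t} of the definition and the tripling δ_{j+1} = 3δ_j used in the argument; the function name deltas is hypothetical.

```python
from fractions import Fraction

def deltas(t):
    """Return [delta_0, ..., delta_t] with delta_0 = 10^-2 * 3^-t and delta_{j+1} = 3 * delta_j."""
    d = Fraction(1, 100) / 3**t
    out = [d]
    for _ in range(t):
        d *= 3            # each round shrinks the model by a factor of at most 3
        out.append(d)
    return out
```

For every t the last entry is exactly 1/100, so the niceness parameter never exceeds 10^{-2} during the t rounds, and in particular stays below the bound .5 needed when the two ends of the model are treated separately.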